Force-pushed from 3843663 to b54d873.
The flag should cause `hive.Start` not to start the connect loop.
@@ -147,7 +147,7 @@ func TestDiscoverySimulationSimAdapter(t *testing.T) {
 	testDiscoverySimulationSimAdapter(t, *nodeCount, *initCount)
 }

-func TestDiscoveryPersistenceSimulationSimAdapter(t *testing.T) {
+func XTestDiscoveryPersistenceSimulationSimAdapter(t *testing.T) {
@zelig could you maybe add a few comments on what this test is supposed to test?
I am also not sure how the vars `persistenceEnabled` and `discoveryEnabled` are supposed to work - what if tests are run concurrently?
It seems to me that this test is starting a network and connecting nodes in a chain, and then waiting for the Kademlia table to be set up by `hive`. At this point it stops the nodes, stops discovery and starts them back up, expecting the Kademlia table to be set up without `discovery`.
If this is correct, it seems like we are changing the meaning of the `NoDiscovery` flag - previously it was just preventing `subPeerMsg` message exchanges, and not preventing actual connections, whereas now if we set the `NoDiscovery` flag to true, we don't want `hive` to trigger any connections whatsoever.
Therefore if my assumption on this test is correct, it is expected for it to fail after the change.
What do you think?
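The concurrency concern about `persistenceEnabled` and `discoveryEnabled` can be illustrated with a minimal sketch. It assumes a hypothetical `Service` type and `newServiceFunc` helper that are not part of the Swarm codebase: passing the setting into a constructor closure gives each test its own copy, whereas package-level variables would be shared by tests running at the same time.

```go
package main

import "fmt"

// Service is a hypothetical stand-in for the simulated node service.
type Service struct {
	AutoConnect bool
}

// newServiceFunc returns a constructor that captures its own configuration.
// Each test can build its own constructor, so no mutable package-level
// variable is shared between tests that run concurrently.
func newServiceFunc(autoConnect bool) func() *Service {
	return func() *Service {
		return &Service{AutoConnect: autoConnect}
	}
}

func main() {
	withLoop := newServiceFunc(true)
	withoutLoop := newServiceFunc(false)
	fmt.Println(withLoop().AutoConnect, withoutLoop().AutoConnect)
}
```

The `createSimServiceMap(discovery bool)` signature visible in this PR's diff already follows a similar parameter-passing shape.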
Well, you seemed to want to redefine this flag so that it does not use the saved address book and only allows manual (or snapshot-driven) connections.
This test is supposed to test that peers persist across sessions, i.e., that the address book is saved and used when bootstrapping connectivity.
BTW the address book persistence has been broken for a long time - if there are peers that no longer exist, we never clear them out, and we keep on trying to connect to them indefinitely - something we tried fixing with @homotopycolimit at some point, but it didn't work nicely as a quick hack.
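One possible way to stop retrying dead peers forever is to count consecutive failures and drop a peer once a limit is reached. The following is only a sketch under that assumption: `AddressBook`, `ConnectFailed`, and the failure limit are hypothetical names, and this is not the fix that was attempted in Swarm.

```go
package main

import "fmt"

// AddressBook is a hypothetical in-memory address book that counts
// consecutive failed connection attempts per peer address.
type AddressBook struct {
	maxFailures int
	failures    map[string]int
}

// NewAddressBook creates a book that drops a peer after maxFailures
// consecutive failed attempts.
func NewAddressBook(maxFailures int) *AddressBook {
	return &AddressBook{maxFailures: maxFailures, failures: make(map[string]int)}
}

// Add registers a peer address with a clean failure count.
func (b *AddressBook) Add(addr string) { b.failures[addr] = 0 }

// Has reports whether the address is still known.
func (b *AddressBook) Has(addr string) bool {
	_, ok := b.failures[addr]
	return ok
}

// Connected resets the failure count after a successful connection.
func (b *AddressBook) Connected(addr string) {
	if b.Has(addr) {
		b.failures[addr] = 0
	}
}

// ConnectFailed records a failed attempt and prunes the peer once the
// limit is reached, so it is not retried indefinitely.
func (b *AddressBook) ConnectFailed(addr string) {
	if !b.Has(addr) {
		return
	}
	b.failures[addr]++
	if b.failures[addr] >= b.maxFailures {
		delete(b.failures, addr)
	}
}

func main() {
	book := NewAddressBook(3)
	book.Add("peer-a")
	for i := 0; i < 3; i++ {
		book.ConnectFailed("peer-a")
	}
	fmt.Println("peer-a still known:", book.Has("peer-a"))
}
```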
@nonsense this test was my first coding task on swarm so I take responsibility :D
Discovery should indeed not affect whether we connect to peers from our address book, so I don't see why this should change the way we connect to existing known peers, and I'm not sure that disabling the test would help with this at all.
In fact, it makes more sense to have `--no-hive` mean no connect loop. So you could simply fix this test by wrapping the hive service in a struct and redefining its `Protocols` function to return empty. That way you get the hive service with the connect loop but no protocol running, so you can safely test bootstrapping from the address book.
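The wrapping suggested here can be sketched with Go struct embedding. All types below (`Protocol`, `Service`, `Hive`, `noProtocolHive`) are simplified stand-ins invented for illustration, not the real `p2p` or `network` types.

```go
package main

import "fmt"

// Protocol is a stand-in for a p2p protocol description; only the name matters here.
type Protocol struct{ Name string }

// Service is a hypothetical, reduced version of a node service interface.
type Service interface {
	Protocols() []Protocol
	Start() error
}

// Hive is a simplified stand-in for the real hive service.
type Hive struct{ connectLoopRunning bool }

// Protocols returns the discovery protocol the hive would normally run.
func (h *Hive) Protocols() []Protocol { return []Protocol{{Name: "hive"}} }

// Start marks the connect loop as running; the real service would spawn it here.
func (h *Hive) Start() error {
	h.connectLoopRunning = true
	return nil
}

// noProtocolHive embeds *Hive and shadows Protocols, so no discovery
// protocol is registered while Start (and the connect loop) is inherited.
type noProtocolHive struct{ *Hive }

// Protocols deliberately returns nothing.
func (noProtocolHive) Protocols() []Protocol { return nil }

func main() {
	var svc Service = noProtocolHive{&Hive{}}
	if err := svc.Start(); err != nil {
		panic(err)
	}
	fmt.Println("protocols registered:", len(svc.Protocols()))
}
```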
I'd rather not redefine the `Protocols` function to test persistence of the Kademlia address book; this seems like a very ugly hack.
With this PR `--no-hive-discovery` does not run the `connect()` loop.
Let's talk about the address book as part of the Kademlia epic; this is out of scope for this PR. Both Elad and I seem to think that, similar to how bootnodes are connected to outside of `hive`, it makes sense to do something similar for the address book.
The address book persistence test should not be removed.
Maybe we need two flags: `discoveryDisabled` and `autoconnectDisabled`.
I am not sure about `persistenceEnabled`; I cannot see the use case for disabling it.
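The two-flag idea could look roughly like the sketch below, where discovery and the connect loop are switched independently. The field names `Discovery` and `AutoConnect` appear in this PR's diff; the reduced `HiveParams` and `Hive` types and the `discovering` and `connecting` fields are assumptions made for illustration.

```go
package main

import "fmt"

// HiveParams is a reduced, hypothetical parameter set with two independent switches.
type HiveParams struct {
	Discovery   bool // exchange peer suggestions with connected peers
	AutoConnect bool // run the loop that dials suggested or stored peers
}

// Hive is a stand-in that only records which parts were started.
type Hive struct {
	params      HiveParams
	discovering bool
	connecting  bool
}

// NewHive creates a hive with the given parameters.
func NewHive(p HiveParams) *Hive { return &Hive{params: p} }

// Start enables each part only if its own flag is set, so disabling
// discovery does not implicitly disable the connect loop, and vice versa.
func (h *Hive) Start() {
	h.discovering = h.params.Discovery
	h.connecting = h.params.AutoConnect
}

func main() {
	h := NewHive(HiveParams{Discovery: false, AutoConnect: true})
	h.Start()
	fmt.Printf("discovery=%v connect loop=%v\n", h.discovering, h.connecting)
}
```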
Yes, the definition of We obviously need a I am not sure if we need an
Address book persistence in my opinion doesn't work well, so it should get its own issue, and is probably out of scope for this PR. If there are peers in it that no longer exist, a node will try to connect to them indefinitely. I think it is more important to be able to deterministically build networks right now, so that we can improve our overall testing - something that @acud is having pain with in
@nonsense let's re-approach this.
do you want to block the possibility of getting notified about new peers from your peers or would you like to disable address book persistence?
@acud this issue/PR is not at all about any of this. I don't see So the question is do we want to couple Ultimately I just want to be able to start Swarm without having
@zelig @acud we spoke about this on a meeting 2-3 weeks ago, and I claimed that there is no way to create a deterministic network with Kademlia - I think this PR mostly shows that, because there is no way to disable hive and to stop it from making connections. So how do you want to complete this?
I think the test that checks for persistence is wrong, because it checks for connectivity and not for saving/loading of peers. In general I think persistence should be improved as a larger effort, since it doesn't clear up old peers. Bottom line - I don't think the
Having the possibility to create deterministic networks is quite important for all our efforts right now, so I'd suggest we go forward with this, and not wait for the Kademlia refactor we have on the roadmap. I think it'd be useful to have reproducible tests sooner rather than later.
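A persistence test that checks saving and loading rather than connectivity could be shaped like the round trip below. This is a sketch only: the `AddressBook` type, the JSON encoding, and the `roundTrip` helper are assumptions chosen for illustration and do not reflect how the Kademlia address book is actually stored.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"reflect"
)

// AddressBook is a hypothetical, simplified list of known peer addresses.
type AddressBook struct {
	Peers []string `json:"peers"`
}

// Save writes the address book as JSON.
func (b *AddressBook) Save(w io.Writer) error {
	return json.NewEncoder(w).Encode(b)
}

// LoadAddressBook reads an address book previously written by Save.
func LoadAddressBook(r io.Reader) (*AddressBook, error) {
	var b AddressBook
	if err := json.NewDecoder(r).Decode(&b); err != nil {
		return nil, err
	}
	return &b, nil
}

// roundTrip saves and reloads a book without touching the network,
// which is the property a persistence test would assert on.
func roundTrip(b *AddressBook) (*AddressBook, error) {
	var buf bytes.Buffer
	if err := b.Save(&buf); err != nil {
		return nil, err
	}
	return LoadAddressBook(&buf)
}

func main() {
	original := &AddressBook{Peers: []string{"peer-a", "peer-b"}}
	loaded, err := roundTrip(original)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip preserved peers:", reflect.DeepEqual(original, loaded))
}
```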
I think the basic misunderstanding here is the discovery flag. This actually has The persistence test semantics to this flag in this case is correct, since the content of the persisted address book should be bootstrapped to the node (or at least intercepted with a hook and asserted correctly). I agree that address book persistence should not be related to hive. If currently the address book content is loaded through hive then this is incorrect and should be done through the p2p server somehow.
@zelig about 2: I'd rather not redefine the
minor plus I think address book persistence simulation should be restored and adapted
api/config.go
Outdated
 	MaxStreamPeerServers int
 	LightNodeEnabled bool
 	BootnodeMode bool
+	HiveDisableAutoConnect bool
no need for Hive prefix
network/simulation/kademlia_test.go
Outdated
@@ -130,7 +130,7 @@ func createSimServiceMap(discovery bool) map[string]ServiceFunc {
 	"bzz": func(ctx *adapters.ServiceContext, b *sync.Map) (node.Service, func(), error) {
 		addr := network.NewAddr(ctx.Config.Node())
 		hp := network.NewHiveParams()
-		hp.Discovery = discovery
+		hp.AutoConnect = discovery
please change the variable name too
@@ -186,26 +178,6 @@ func testDiscoverySimulation(t *testing.T, nodes, conns int, adapter adapters.No
 	t.Logf("Setup: %s, shutdown: %s", result.StartedAt.Sub(startedAt), finishedAt.Sub(result.FinishedAt))
 }

-func testDiscoveryPersistenceSimulation(t *testing.T, nodes, conns int, adapter adapters.NodeAdapter) map[int][]byte {
I disagree with removing this test when you can easily fix it. Please
@@ -513,9 +328,9 @@ func newService(ctx *adapters.ServiceContext) (node.Service, error) {
 	kad := network.NewKademlia(addr.Over(), kp)
 	hp := network.NewHiveParams()
 	hp.KeepAliveInterval = time.Duration(200) * time.Millisecond
-	hp.Discovery = discoveryEnabled
+	hp.AutoConnect = discoveryEnabled
please change the variable name too. otherwise we accumulate technical debt
network/simulations/overlay.go
Outdated
@@ -94,7 +94,7 @@ func (s *Simulation) NewService(ctx *adapters.ServiceContext) (node.Service, err
 	kp.RetryInterval = 1000000
 	kad := network.NewKademlia(addr.Over(), kp)
 	hp := network.NewHiveParams()
-	hp.Discovery = !*noDiscovery
+	hp.AutoConnect = !*noDiscovery
same
Force-pushed from b6fed64 to 1752f76.
Force-pushed from 1752f76 to 25b1804.
All right, so you are only adding a new flag. Fine.
* 'master' of github.com:ethersphere/swarm: (54 commits)
  api, chunk, cmd, shed, storage: add support for pinning content (ethersphere#1509)
  docs/swarm-guide: cleanup (ethersphere#1620)
  travis: split jobs into different stages (ethersphere#1615)
  simulation: retry if we hit a collision on tcp/udp ports (ethersphere#1616)
  api, chunk: rename Tag.New to Tag.Create (ethersphere#1614)
  pss: instrumentation and refactor (ethersphere#1580)
  api, cmd, network: add --disable-auto-connect flag (ethersphere#1576)
  changelog: fix typo (ethersphere#1605)
  version: update to v0.4.4 unstable (ethersphere#1603)
  swarm: release v0.4.3 (ethersphere#1602)
  network/retrieve: add bzz-retrieve protocol (ethersphere#1589)
  PoC: Network simulation framework (ethersphere#1555)
  network: structured output for kademlia table (ethersphere#1586)
  client: add bzz client, update smoke tests (ethersphere#1582)
  swarm-smoke: fix check max prox hosts for pull/push sync modes (ethersphere#1578)
  cmd/swarm: allow using a network interface by name for nat purposes (ethersphere#1557)
  pss: disable TestForwardBasic (ethersphere#1544)
  api, network: count chunk deliveries per peer (ethersphere#1534)
  network/newstream: new stream! protocol base implementation (ethersphere#1500)
  swarm: fix bzz_info.port when using dynamic port allocation (ethersphere#1537)
  ...
This PR adds a `--disable-auto-connect` flag, so that we can run Swarm without `hive` discovery and reproduce the same connections between deployments, without relying on the non-deterministic `SuggestPeer` functionality.

The next step would be to add tools to extract and apply connections from running deployments (a very early attempt at #1183) and to generate snapshots out of them so that we can have more determinism in our tests.
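As a rough illustration of what extracting and applying connections might involve, the sketch below turns a topology into a sorted connection list and replays it in a fixed order. `Conn`, `Extract`, and `Apply` are hypothetical names and do not describe the tooling attempted in #1183.

```go
package main

import (
	"fmt"
	"sort"
)

// Conn is a hypothetical record of one connection between two node IDs.
type Conn struct {
	From, To string
}

// Extract turns an adjacency map into a sorted connection list, so the
// same topology always yields the same snapshot.
func Extract(peers map[string][]string) []Conn {
	var conns []Conn
	for from, tos := range peers {
		for _, to := range tos {
			conns = append(conns, Conn{From: from, To: to})
		}
	}
	sort.Slice(conns, func(i, j int) bool {
		if conns[i].From != conns[j].From {
			return conns[i].From < conns[j].From
		}
		return conns[i].To < conns[j].To
	})
	return conns
}

// Apply replays a snapshot in order using the supplied connect function
// and stops at the first error.
func Apply(conns []Conn, connect func(from, to string) error) error {
	for _, c := range conns {
		if err := connect(c.From, c.To); err != nil {
			return fmt.Errorf("connect %s -> %s: %w", c.From, c.To, err)
		}
	}
	return nil
}

func main() {
	snapshot := Extract(map[string][]string{"b": {"c"}, "a": {"c", "b"}})
	_ = Apply(snapshot, func(from, to string) error {
		fmt.Println(from, "->", to)
		return nil
	})
}
```

Sorting the list is a design choice made here so that Go's randomized map iteration order does not leak into the snapshot.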